International Journal of Electrical and Computer Engineering (IJECE) 
Vol. 8, No. 6, December 2018, pp. 4593~4602 
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i6.pp4593-4602


Hybrid Multilevel Thresholding and Improved Harmony 
Search Algorithm for Segmentation 


Erwin¹, Saparudin², Wulandari Saputri³


¹,³Department of Computer Engineering, University of Sriwijaya, Indonesia
²Department of Informatic Engineering, University of Sriwijaya, Indonesia


Article Info

Article history:

Received Apr 9, 2018
Revised Jul 12, 2018
Accepted Jul 22, 2018

Keyword:

Image segmentation
Improved harmony search algorithm
Multilevel thresholding

ABSTRACT

This paper proposes a new method for image segmentation that hybridizes
multilevel thresholding with the improved harmony search algorithm. The
improved harmony search algorithm is a method for finding solution vectors
with increased accuracy. The proposed method generates random candidate
solutions and evaluates their quality through the Otsu objective function.
The operators of the algorithm then evolve the set of candidate solutions
until the optimal solution is found. The datasets used in this study are a
retina dataset and a tongue dataset, together with the lenna, baboon, and
cameraman images. The experimental results show that this method gives high
performance as measured by the peak signal-to-noise ratio (PSNR). The PSNR
for retinal images averaged 40.342 dB, while for tongue images it averaged
35.340 dB. The lenna, baboon, and cameraman images produce averages of
33.781 dB, 33.499 dB, and 34.869 dB, respectively. Furthermore, object
recognition and identification processes are expected to achieve a high
degree of accuracy by using this method.
1. INTRODUCTION

Segmentation is an image processing step that partitions the original image into its constituent regions
or objects. The purpose of segmentation is to separate an object from the rest of the image. Image
processing is now applied in many fields, for example astronomy, archeology, and biomedicine. In
biomedicine it has been widely used for face detection [1] and for analysis of the iris [2], ear, and tongue.
Using image processing techniques such as level set and region growing, an ophthalmologist can identify
disease from an image of the retina [3]. The study in [4] classified types of disease from the color of the
tongue with an accuracy of 91.99%.

Other research on tongue image segmentation was done by [5], with 70% accuracy, and by [6], using
an active contour model that gives 75% accuracy. Additionally, [7] combines region-based and
edge-based methods in segmenting images. Detection and classification of retinal changes for diabetic
retinopathy monitoring were performed by [8]. That research extracted retrospective changes in longitudinal
crack and tyrosinetopathy, showing a 97% detection rate and a 99.3% classification rate. The research in [9]
addressed automatic segmentation and identification of diabetic patients through retinal vessels.

The segmentation method used in that work is the Gabor wavelet transform. The results showed
that traditional features do not detect early proliferative retinopathy. The classification method is a
wavelet method that can group the retinal blood vessels according to the presence or absence of
proliferative retinopathy. The success rate of this method decreased to 50% for identification compared
with the previous stages. Based on the description above, the segmentation process needs to be optimized
in order to improve the quality of segmentation results.

Thresholding is one of the most widely used methods of image segmentation [10],[11].
Segmentation into two segments is called bilevel thresholding, and segmentation into more than two
segments is called multilevel thresholding. Otsu and Kapur are two classical methods of multilevel
thresholding. Segmentation using multilevel thresholding requires a long computation time because it
involves a large number of calculations. To solve this problem, optimization methods should be
applied [12]. Several optimization methods have been successfully applied to multilevel thresholding,
including Genetic Algorithms (GA) and Particle Swarm Optimization (PSO). Examples of multilevel
thresholding with GA are shown in [13], [14]. Furthermore, [15] used multilevel thresholding with the
Improved Differential Search Algorithm (IDSA) to segment images. Image segmentation with PSO-based
multilevel thresholding was done by [16]-[18].

The harmony search algorithm (HSA) is an optimization algorithm inspired by the improvisation
process of jazz musicians and by musical performance in which various instruments together
produce beautiful melodies. The algorithm was introduced by Geem et al. [19], [20]. In previous
research, multilevel thresholding with HSA was used to segment images with two thresholding
methods, i.e., Otsu and Kapur. Multilevel thresholding with Otsu optimized using HSA
showed better results than Kapur [19]. The improved harmony search algorithm (IHSA) has also been
applied to segmentation of brain images [21]. That study combines IHSA with fuzzy
clustering algorithms. However, a weakness remains in that study: the PSNR value it reports is
not high enough.

Research on retinal image segmentation has been done by [22], [23], aiming to simplify or change
the image representation into something that is easier to analyze. The authors of [24] studied retinal
images by proposing a computerized technique for extracting retinal vessels. In addition, [25] and [26]
studied retinal images for segmentation and classification of retinal disease types using different
methods. The work in [27] segmented the retinal vessels using a single oriented mask filter.

The experimental results of that work show that its proposed method outperforms a single oriented mask
filter [28]. The area around the retina has also been segmented using adaptive superpixelation in order to
detect disease in that area; the experimental evaluation gave 96% accuracy [29].
A further study addressed the early diagnosis of diseases such as glaucoma, diabetic retinopathy,
macular degeneration, hypertensive retinopathy, and arteriosclerosis. Segmentation in that study uses
two methods: extraction of blood vessel centerline pixels and iterative region growing.

This research proposes a novel method to improve segmentation performance: hybrid multilevel
thresholding and improved harmony search algorithm (MT-IHSA), which combines IHSA with
thresholding using Otsu. IHSA differs from HSA in the way the pitch adjusting rate (PAR) and
bandwidth (BW) parameters are adjusted [21]. This research is expected to show better PSNR results
than previous research.


2. OTSU MULTILEVEL THRESHOLDING FOR IMAGE SEGMENTATION 

Image segmentation is a process that separates the image into foreground and background so that it is
easier to analyze [30]. Segmentation is very important: the higher the accuracy achieved at the
segmentation stage, the better the subsequent object recognition [31]. Thresholding is a non-linear
operation that is important in image segmentation [32]. The basic idea of thresholding is to choose
an optimal gray-level threshold value that separates objects from the background based on the gray-level
distribution [33].

There are two types of thresholding, i.e., global and local. Otsu is a global thresholding method
introduced by Otsu in 1979 [34]. It is widely used because it is simple and effective. Otsu uses
the maximum between-class variance as the criterion for segmenting the image. Given the L intensity
levels of a grayscale or RGB image, the probability distribution of the intensity values of the image can
be calculated as in Equation 1 [19]:


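$$Ph_i^c = \frac{h_i^c}{NP}, \qquad \sum_{i=0}^{L-1} Ph_i^c = 1, \tag{1}$$

where: $i$ = intensity level ($0 \le i \le L-1$)
$c$ = component of the image, depending on grayscale or RGB
$NP$ = the number of pixels in the image
$h_i^c$ = histogram (the number of pixels with intensity level $i$ in component $c$)
$Ph_i^c$ = probability distribution
The simplest bilevel segmentation can be defined as in Equation 2:

$$C_1 = \frac{Ph_0^c}{\omega_0^c(th)}, \ldots, \frac{Ph_{th}^c}{\omega_0^c(th)}, \qquad C_2 = \frac{Ph_{th+1}^c}{\omega_1^c(th)}, \ldots, \frac{Ph_{L-1}^c}{\omega_1^c(th)}, \tag{2}$$

where $\omega_0^c(th)$ and $\omega_1^c(th)$ are the probability distributions for $C_1$ and $C_2$, and $\mu_0^c$ and $\mu_1^c$ are the class means:

$$\omega_0^c(th) = \sum_{i=0}^{th} Ph_i^c, \qquad \omega_1^c(th) = \sum_{i=th+1}^{L-1} Ph_i^c$$

$$\mu_0^c = \sum_{i=0}^{th} \frac{i\,Ph_i^c}{\omega_0^c(th)}, \qquad \mu_1^c = \sum_{i=th+1}^{L-1} \frac{i\,Ph_i^c}{\omega_1^c(th)}$$

The Otsu between-class variance can be calculated with Equations 3 and 4:

$$\sigma^{2^c} = \sigma_1^c + \sigma_2^c, \tag{3}$$

$$\sigma_1^c = \omega_0^c\left(\mu_0^c - \mu_T^c\right)^2, \qquad \sigma_2^c = \omega_1^c\left(\mu_1^c - \mu_T^c\right)^2, \tag{4}$$

where $\mu_T^c = \omega_0^c\mu_0^c + \omega_1^c\mu_1^c$ and $\omega_0^c + \omega_1^c = 1$.
The objective function based on the values of $\sigma_1^c$ and $\sigma_2^c$ is:

$$J(TH) = \max\left(\sigma^{2^c}(TH)\right), \tag{5}$$

with $0 \le th_i \le L-1$, $i = 1, 2, \ldots, k-1$, where $TH = [th_1, th_2, \ldots, th_{k-1}]$ is a vector containing the thresholds
for $k$ classes, and the variance is computed as in Equation 6:

$$\sigma^{2^c} = \sum_{i=1}^{k} \sigma_i^c, \tag{6}$$

where $\sigma_i^c = \omega_{i-1}^c\left(\mu_{i-1}^c - \mu_T^c\right)^2$ generalizes Equation 4 to the $i$-th class.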
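
As an illustration of Equations 1 to 6, the following Python sketch computes the histogram probabilities and the between-class variance for a vector of thresholds on a single grayscale channel. It is a minimal sketch under stated assumptions, not the implementation used in this study: the function names `histogram_probabilities` and `otsu_between_class_variance`, the use of NumPy, and the convention that a pixel with intensity equal to a threshold belongs to the lower class are all choices made here. The variance uses the standard Otsu form, in which each class contributes its weight times the squared difference between the class mean and the total mean.

```python
import numpy as np

def histogram_probabilities(gray, levels=256):
    """Eq. 1: Ph_i = h_i / NP for one channel of an integer-valued image."""
    hist = np.bincount(gray.ravel(), minlength=levels).astype(float)
    return hist / hist.sum()

def otsu_between_class_variance(prob, thresholds):
    """Eqs. 3-6: sum over classes of w_k * (mu_k - mu_T)^2."""
    levels = np.arange(prob.size)
    mu_total = float(np.dot(levels, prob))
    # Class k covers intensities edges[k] .. edges[k + 1] - 1 (assumed convention).
    edges = [0] + [int(t) + 1 for t in sorted(thresholds)] + [prob.size]
    variance = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        weight = prob[lo:hi].sum()
        if weight > 0:  # empty classes contribute nothing
            mu = np.dot(levels[lo:hi], prob[lo:hi]) / weight
            variance += weight * (mu - mu_total) ** 2
    return float(variance)
```

A search algorithm such as HSA or IHSA would call `otsu_between_class_variance` once per candidate threshold vector and keep the vector with the largest value.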


3. IMPROVED HARMONY SEARCH ALGORITHM 

HSA is a metaheuristic optimizer introduced by Zong Woo Geem, Joong Hoon Kim, and G.V.
Loganathan in 2001; it has yielded very good results in the field of optimization [32]. HSA is inspired
by the improvisation of jazz musicians, in which various instruments together produce beautiful melodies.
The advantages of HSA compared to other optimization techniques are: HSA is a metaheuristic algorithm
and does not require initial value settings for the decision variables, HSA uses stochastic random searches,
HSA does not require derivative information, it has few parameters, and it can easily be adopted in a wide
range of optimization problems. The steps in the HSA process are as follows [21]:


a. Minimize $f(x)$ subject to $x_i \in X_i$, $i = 1, 2, \ldots, N$
where: $f(x)$ = objective function
$x$ = set of decision variables $x_i$
$N$ = total number of decision variables
$X_i$ = range of possible values for each decision variable
In this step, the HSA parameters are specified. They consist of the number of solution
vectors in harmony memory (HM), called the harmony memory size (HMS); the harmony memory
consideration rate (HMCR); the pitch adjusting rate (PAR); and the termination criterion, called the
number of improvisations (NI). The parameters in HSA are [33]:
HMS: the number of solution vectors held simultaneously in harmony memory (HM). Values vary from 1 to 1000.
HMCR: the rate at which values are taken from HM; the value varies from 0.7 to 0.99.
PAR: the rate at which a value taken from HM is adjusted to a neighboring value; the value varies from 0.1 to 0.5.
NI: the number of iterations of the optimization algorithm.
b. Harmony Memory (HM) Initialization 
At this stage, the HM matrix is filled with HMS randomly generated solution vectors, as shown in
Equation 7.
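
$$HM = \begin{bmatrix}
x_1^1 & x_2^1 & \cdots & x_{N-1}^1 & x_N^1 \\
x_1^2 & x_2^2 & \cdots & x_{N-1}^2 & x_N^2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
x_1^{HMS-1} & x_2^{HMS-1} & \cdots & x_{N-1}^{HMS-1} & x_N^{HMS-1} \\
x_1^{HMS} & x_2^{HMS} & \cdots & x_{N-1}^{HMS} & x_N^{HMS}
\end{bmatrix} \tag{7}$$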
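
The structure of Equation 7 can be illustrated with a short Python sketch that fills an HMS-by-N matrix with random values. This is only a sketch under assumptions: the function name `init_harmony_memory`, the use of a continuous uniform distribution, and a single pair of bounds shared by all variables are choices made here and are not specified in the text.

```python
import numpy as np

def init_harmony_memory(hms, n_vars, lower, upper, rng=None):
    """Return an (hms x n_vars) matrix; each row is one random solution vector."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(lower, upper, size=(hms, n_vars))
```

For threshold search on an 8-bit image, each row would hold one candidate threshold vector with bounds 0 and 255, and the values could be rounded to integers before evaluation.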


c. Improvisation of New Harmony 

Improvisation is the process of generating a new harmony. The new harmony vector is obtained
under the following rules:

a) Choose a value from harmony memory (memory consideration)

b) Select a value adjacent to one taken from harmony memory (pitch adjustment)

c) Select a random value from the range of possible values (randomization)

In memory consideration, the value of the first decision variable ($x_1'$) of the new vector is taken
from any of the values stored in HM for that variable ($x_1^1$ to $x_1^{HMS}$). The values of the other
decision variables are picked in the same way. HMCR (0 to 1) is the probability of choosing a value from
HM, while (1 - HMCR) is the probability of selecting a random value from the possible range.
At this stage, memory consideration, pitch adjustment, or random selection is applied in turn to each
variable of the new harmony vector.

d. Update HM 

In this step, if the new harmony vector is better than the worst harmony in HM, judged by
the value of the objective function, the new harmony enters HM and the worst harmony is
removed from HM.

e. Check for termination criteria 

If the termination criterion is met (the maximum NI is reached), the computation stops. If not,
steps c and d are repeated. The main difference between HSA and IHSA lies in how PAR and BW are adjusted.
IHSA improves the performance of HSA and eliminates its weak points by varying PAR and
BW across generations in the improvisation step. Pseudocode for the original IHSA algorithm:

Step 1. Initialize parameters HMS, HMCR, PARmin, PARmax, BWmin, BWmax, and NI.
Step 2. Initialize HM and calculate f(x) of each harmony vector.
Step 3. Improvise new harmony (gn is the current generation number).
    PAR = PARmin + ((PARmax - PARmin) / NI) × gn
    c = ln(BWmin / BWmax) / NI
    BW = BWmax × exp(c × gn)
    for (all variables i)
        if rand() < HMCR
            x'i = xi^j, j ∈ {1, 2, ..., HMS}   (choose value from HM)
            if rand() < PAR
                x'i = x'i ± rand() × BW
            endif
        else
            (choose a random value of variable)
            x'i = PVBlower + rand() × (PVBupper - PVBlower)
        endif
    endfor
Step 4. Update HM.
    if (f(new solution) < f(worst solution))
        replace the worst harmony in HM with the new harmony
    endif
Step 5. Check stopping criteria. If NI is completed, terminate computation; otherwise go back to Step 3.
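
Steps 3 and 4 of the IHSA pseudocode can be sketched in Python as follows. This is a hedged illustration, not the implementation used in this study: the function names `par_at`, `bw_at`, `improvise`, and `update_memory` are hypothetical, the pitch adjustment is read as a uniform perturbation in [-BW, +BW], and clamping the new value to the variable bounds is an added assumption.

```python
import math
import random

def par_at(gn, ni, par_min, par_max):
    """PAR grows linearly from par_min (gn = 0) to par_max (gn = ni)."""
    return par_min + (par_max - par_min) / ni * gn

def bw_at(gn, ni, bw_min, bw_max):
    """BW decays exponentially from bw_max (gn = 0) to bw_min (gn = ni)."""
    c = math.log(bw_min / bw_max) / ni
    return bw_max * math.exp(c * gn)

def improvise(hm, gn, ni, hmcr, par_min, par_max, bw_min, bw_max,
              lower, upper, rng=random):
    """Build one new harmony from harmony memory hm (a list of lists)."""
    par = par_at(gn, ni, par_min, par_max)
    bw = bw_at(gn, ni, bw_min, bw_max)
    new = []
    for i in range(len(hm[0])):
        if rng.random() < hmcr:
            value = rng.choice(hm)[i]                # memory consideration
            if rng.random() < par:
                value += rng.uniform(-1.0, 1.0) * bw  # pitch adjustment (assumed +/-)
        else:
            value = lower + rng.random() * (upper - lower)  # random selection
        new.append(min(max(value, lower), upper))    # clamp to bounds (assumption)
    return new

def update_memory(hm, scores, new, new_score):
    """Minimisation: replace the worst (largest f) harmony if the new one is better."""
    worst = max(range(len(hm)), key=scores.__getitem__)
    if new_score < scores[worst]:
        hm[worst], scores[worst] = new, new_score
```

Because the pseudocode minimizes f(x), an objective that should be maximized, such as the Otsu between-class variance, can be passed as its negative.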


4. PROPOSED METHOD: HYBRID MULTILEVEL THRESHOLDING AND IMPROVED 
HARMONY SEARCH ALGORITHM (MT-IHSA) 
This method, MT-IHSA, is a hybrid of Otsu multilevel thresholding and IHSA. The proposed method
draws random samples from the histogram search space as candidate solutions, then evaluates their
quality using the Otsu objective function. The IHSA operators then evolve the candidate
solutions until the optimal solution is found. Pseudocode for the MT-IHSA algorithm:

Step 1. Obtain histograms.
Step 2. Calculate the probability distribution.
Step 3. Initialize the IHSA parameters: HMS, HMCR, PARmin, PARmax, BWmin, BWmax, and NI.
Step 4. Initialize a Harmony Memory (HM).
Step 5. Calculate Otsu variance.
Step 6. Evaluate objective function.
Step 7. Improvise a new harmony.
Step 8. Update the HM.
Step 9. If NI is completed or the stop criteria is satisfied, then jump to Step 10, otherwise go back to Step 6.
Step 10. Select the harmony that has the best objective function value.
Step 11. Apply the best threshold values to the image.
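
Step 11 can be illustrated with a short Python sketch that applies a vector of thresholds to a grayscale image. This is a sketch under assumptions: the function name `apply_thresholds` is hypothetical, and painting each class with its mean intensity is one possible choice of output gray level that the text does not specify.

```python
import numpy as np

def apply_thresholds(gray, thresholds):
    """Label each pixel by threshold class and paint it with the class mean."""
    th = np.sort(np.asarray(thresholds))
    # right=True: a pixel equal to a threshold falls in the lower class.
    labels = np.digitize(gray, th, right=True)
    segmented = np.zeros_like(gray)
    for k in range(th.size + 1):
        mask = labels == k
        if mask.any():
            segmented[mask] = int(round(float(gray[mask].mean())))
    return labels, segmented
```

With k-1 thresholds the label image takes values 0 to k-1, so the segmented image contains at most k distinct gray levels.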


5. RESULT AND DISCUSSION 

The experiment used two datasets: the retinal dataset obtained from
STructured Analysis of the Retina (STARE), with 450 retina images, and the tongue dataset obtained from
the biometric research center (BRC), with 12 tongue images in JPG format. In addition, the lenna, baboon,
and cameraman images are also used for testing the proposed method. The parameters used in
MT-IHSA, presented in Table 1, are NI, HMS, HMCR, PAR Min, PAR Max, BW Min, and BW Max. The
parameter values follow [19]:


Table 1. Parameters used in MT-IHSA

Parameters   Values
NI           2,000
HMS          5
HMCR         0.9
PAR Min      0.01
PAR Max      0.99
BW Min       0.001
BW Max       0.1


Table 2 shows the result of image segmentation using MT-IHSA with five thresholds; the histograms
of the five types of image show a very significant difference. For the original image, the histogram still has
red, green, and blue (RGB) components. For the grayscale image, the histogram has a single gray component,
but it still spans many intensity levels, so it is difficult to distinguish between foreground and background.
The third histogram is that of the image after MT-IHSA is applied. After the MT-IHSA process the image
has very few intensity levels, so it approaches a binary image.

Table 2 also shows the results of applying segmentation using multilevel thresholding. Before
MT-IHSA is applied, the histogram of the original image is taken in order to see the color pixel
values contained in the original image. The image is then converted to grayscale to reduce the pixel values
carried by the RGB colors. The histogram obtained from the grayscale image has fewer pixel values.
Segmentation using MT-IHSA is then applied. The number of distinct pixel values obtained is very low, and
background and foreground are completely separated. Although few pixel values remain, the resulting image
quality is very good and objects are seen more clearly using MT-IHSA.

The number of thresholds used in this testing process is th = 2, 3, 4, 5. PSNR compares the
maximum pixel value of the image with the mean square error (MSE). MSE is the average squared error
between the segmented image and the original image. A greater PSNR indicates better image
quality. PSNR is expressed in decibels (dB), and a PSNR value is considered good if it is >= 30 dB. PSNR
is given by Equation 8 and the root mean square error (RMSE) by Equation 9:


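$$PSNR = 20\log_{10}\left(\frac{255}{RMSE}\right) \tag{8}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{row}\sum_{j=1}^{col}\left(I_o^c(i,j) - I_{th}^c(i,j)\right)^2}{row \times col}} \tag{9}$$

where: $I_o^c$ = original image
$I_{th}^c$ = segmented image
$row \times col$ = total number of pixels (image rows times image columns)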
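
Equations 8 and 9 translate directly into a few lines of Python. The sketch below assumes 8-bit images of equal shape held as NumPy arrays; the function names `rmse` and `psnr` and the choice to return infinity for identical images are assumptions made here.

```python
import numpy as np

def rmse(original, segmented):
    """Eq. 9: root mean square error over all row x col pixels."""
    diff = original.astype(np.float64) - segmented.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(original, segmented, peak=255.0):
    """Eq. 8: PSNR in dB; identical images give infinity (assumed convention)."""
    err = rmse(original, segmented)
    if err == 0:
        return float("inf")
    return float(20.0 * np.log10(peak / err))
```

The images are cast to floating point before subtraction so that differences of unsigned 8-bit values do not wrap around.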

Another goal of using PSNR is to evaluate the similarity between the segmented image and the
original image. Table 3 compares the PSNR results for the retinal images and the tongue images.

In Table 3, all PSNR values are above 30 dB, so the segmentation is considered successful. The higher
the PSNR, the better the quality of the segmented image. Segmentation of the tongue images generally
gives a lower PSNR value than segmentation of the retinal images.


Table 3. Comparison of PSNR Value and Segmentation Threshold Results with MT-IHSA
Using Retina Image and Tongue Image

             PSNR (dB)                            PSNR (dB)
Image  th    Retina Image  Tongue Image    Image  Retina Image  Tongue Image
1      2     43.36         35.558          7      35.578        34.88
       3     45.121        36.074                 39.68         35.617
       4     45.121        37.198                 40.349        36.061
       5     48.131        38.015                 41.141        37.173
2      2     41.141        33.745          8      39.68         31.992
       3     42.11         34.761                 42.11         34.857
       4     43.36         35.482                 45.121        36.426
       5     45.121        36.013                 48.131        36.426
3      2     40.349        34.674          9      38.588        32.037
       3     39.68         35.468                 39.68         34.279
       4     42.11         36.035                 40.349        35.288
       5     40.349        37.138                 39.68         35.936
4      2     41.141        34.76           10     39.09         31.595
       3     42.11         35.657                 37.339        32.908
       4     43.36         36.377                 38.131        35.306
       5     45.121        37.307                 39.68         35.926
5      2     33.079        35.109          11     38.131        33.115
       3     34.514        36.012                 38.131        34.594
       4     36.991        36.806                 39.68         35.653
       5     37.339        37.635                 41.141        36.082
6      2     39.1          33.673          12     33.981        34.419
       3     40.349        33.822                 34.707        35.032
       4     41.141        34.814                 36.67         36.032
       5     42.11         35.56                  37.339        36.981


Segmentation in [28] uses a single oriented mask filter; the segmentation result
improved, but the result has not been maximized. The disadvantage of this method is that
processing takes a long time, so the process is slower. With MT-IHSA, the data is processed
faster and the PSNR obtained exceeds 30 dB. Another method used for
segmentation is the Gabor wavelet transform. However, its results showed that traditional
features do not detect early proliferative retinopathy, and the success rate is only 50%. Among the methods
described, the comparison indicates that MT-IHSA is an excellent method for the segmentation process.

Figure 1 compares segmentation using the multilevel thresholding harmony
search algorithm (MT-HSA) performed by [19] with the proposed MT-IHSA method on the lenna, baboon,
and cameraman images. With MT-HSA, the PSNR values obtained are below 30 dB, that is, below the
30 dB level taken here as the threshold for good quality, so the MT-HSA segmentation is not yet
satisfactory. With MT-IHSA, the PSNR obtained exceeds the 30 dB limit, which indicates that the
segmentation is of good quality.

For comparison with other methods, Figure 2 presents the results of [35], using Tongue Color, Texture,
and Geometry Features (CTGF); [19], using the Multilevel Thresholding harmony search algorithm (MT-HSA);
and [36], using the Multilevel Thresholding Firefly Algorithm (MT-FA) and the Multilevel
Thresholding Social Spider Algorithm (MT-SSA).
Figure 1. PSNR comparison of MT-HSA and MT-IHSA


In the comparison of PSNR results between the proposed method (MT-IHSA) and the MT-HSA,
CTGF, MT-FA, and MT-SSA methods, the proposed method gives the highest PSNR score.
The higher PSNR value shows that image segmentation with the proposed method
produces the best segmentation quality among the compared methods.
Figure 2. PSNR comparison of the segmentation methods MT-HSA, CTGF, MT-FA, and MT-SSA
with MT-IHSA


6. CONCLUSION 

In this study, we propose a new method that hybridizes multilevel thresholding with the improved
harmony search algorithm. The method combines the IHSA algorithm with the objective function of
multilevel thresholding using the Otsu method. The indicator used in this study to evaluate the performance
of MT-IHSA is PSNR. The PSNR of MT-IHSA on the retinal images is higher
than on the tongue images, although the tongue images still produce excellent
segmentation. Likewise, for the Lenna, baboon, and cameraman images,
PSNR increased after applying MT-IHSA; previously, MT-HSA applied to these images gave PSNR
below 30 dB. In the comparison with other segmentation methods, MT-IHSA yields the highest
PSNR value. The level of illumination of an object strongly influences segmentation and how clearly
the results appear. For the tongue images, the PSNR results are better than in previous studies,
exceeding 30 dB.

The PSNR result for retinal images averaged 40.342 dB, while for tongue images the average is 35.340
dB. The lenna, baboon, and cameraman images produce average PSNR values of 33.781 dB, 33.499 dB, and
34.869 dB, respectively. Furthermore, object recognition and identification processes are expected to
achieve a high degree of accuracy by using this method.